Abstract: Early identification and precise prediction of heart disease have important implications for 
preventative measures and better patient outcomes since cardiovascular disease is a leading cause of 
death globally. By analyzing massive amounts of data and seeing patterns that might aid in risk 
stratification and individualized treatment planning, machine learning algorithms have emerged as 
valuable tools for heart disease prediction. Predictive modeling is considered for many forms of heart 
illness, such as coronary artery disease, myocardial infarction, heart failure, arrhythmias, and valvular 
heart disease. Resource allocation, preventative care planning, workflow optimization, patient 
involvement, quality improvement, risk-based contracting, and research progress are all discussed 
as management implications of heart disease prediction. The effective application of machine 
learning-based cardiac disease prediction models requires collaboration between healthcare 
organizations, providers, and data scientists. This paper uses three tools to predict heart disease: the 
neutrosophic analytical hierarchy process (AHP) as a feature selection method, association rules, and 
machine learning models. The neutrosophic AHP method computes the weights of the features and 
selects the highest-weighted features, which are then used as input to the machine learning models. 
The association rules describe relationships between values in the dataset. We used nine machine 
learning models to predict heart disease. Random forest (RF) and decision tree (DT) achieved the 
highest accuracy at 100%, followed by Bagging, k-nearest neighbors (KNN), and gradient boosting at 
99%, 98%, and 97%, respectively. AdaBoost reached 89%, logistic regression and Naive Bayes reached 
84%, and the support vector machine (SVM) had the lowest accuracy at 68%. 


Keywords: Machine Learning; Heart Disease Prediction; Association Rules; Neutrosophic AHP; 
Feature Selection; Accuracy. 


1. Introduction 


The worldwide burden of morbidity and death due to cardiovascular disease continues to be 
high. Preventative measures, optimal therapeutic approaches, and a decrease in adverse 
cardiovascular events may all benefit greatly from the early identification and precise prediction of 
persons at risk for heart disease. The early identification of people at risk for cardiovascular disease 
has been the subject of a great deal of study over the years, leading to the development of prediction 
models and risk assessment approaches. This study examines the state of the art in predicting 
cardiovascular illness and discusses the obstacles, opportunities, and future paths that lie ahead [1, 
2]. 

Heart disease, which includes coronary artery disease, myocardial infarction, and heart failure, 
is a complicated multifactorial ailment impacted by a wide range of hereditary, environmental, and 
behavioral variables. Understanding these characteristics and how they interact is crucial for 
accurately predicting an individual's risk of developing heart disease [3, 4]. The risk of cardiovascular 
disease may be estimated using conventional risk assessment models like the Framingham Risk 
Score, which takes into account variables including age, gender, blood pressure, cholesterol levels, 
and smoking status. Despite their usefulness, these models often employ a small number of variables 
and may fail to capture important interplays between potential dangers. 

Novel methodologies using machine learning, artificial intelligence, and big data analytics have 
emerged as powerful instruments for cardiac disease prediction thanks to the development of 
technology and the availability of large-scale healthcare data. These methods may one day be able to 
analyze massive volumes of data, unearth previously unknown patterns, and provide unique risk 
assessments for each individual user. Predictive models for cardiovascular illness have been 
progressively developed using machine learning methods such as logistic regression (LR), decision 
trees, random forests, support vector machines (SVMs), and neural networks. Clinical, genetic, 
lifestyle, and imaging data may all be included in these algorithms to provide solid models for precise 
risk assessment [3, 5]. 

There has been a lot of interest in incorporating genetic data into heart disease prediction 
algorithms. Individual vulnerability to heart disease is heavily influenced by genetic variables, and 
the addition of genetic markers may improve the accuracy and precision of prediction algorithms. 

Wearable technology, such as activity trackers and smartwatches, may provide new information 
for predicting cardiovascular disease. For risk assessment and early diagnosis of cardiac disorders, 
these devices can constantly monitor physiological indicators including heart rate, activity levels, and 
sleep patterns [6, 7]. 

Electronic health records (EHRs) are increasingly being used as a reliable tool for predicting 
cardiovascular issues. EHRs are an invaluable resource for building accurate risk assessment models 
because they include so much information about patients. Although there have been improvements 
in heart disease prediction, there are still certain issues that require fixing. There are a number of 
obstacles that must be removed before predictive models can be widely used in clinical settings. These 
include data quality and standardization, interpretability of machine learning models, privacy 
concerns, and bias and fairness in predictive algorithms [8, 9]. 

Predicting cardiovascular illness raises important ethical questions. To maintain patients' trust, it is 
critical that healthcare professionals respect patient privacy, obtain consent before using predictive 
models, and share the results openly. In order to enhance patient outcomes and 
lessen the burden of cardiovascular illness, heart disease prognosis is a fast-developing subject with 
enormous promise. This study aims to improve cardiovascular care by fostering the creation of more 
precise, accessible, and individually tailored risk assessment tools by critically examining existing 
predictive models, addressing challenges, and exploring emerging technologies [10, 11]. 

This paper uses three tools to predict heart disease. In the first step, we use the neutrosophic 
analytical hierarchy process (AHP) as a feature selection method to select the best features [12]. In 
the second step, we use association rules to find rules between variables in the dataset. In the third 
step, we use machine learning models to predict the disease. Figure 1 shows the three overall steps 
used to predict heart disease. 

The rest of this paper is organized as follows: Section 2 introduces the challenges in heart disease 
prediction. Section 3 introduces the methodology of this paper and has three layers including 
neutrosophic AHP as a feature selection, association rules, and machine learning models. Section 4 
presents the results and analysis of the dataset. Section 5 introduces the managerial implications of 
heart disease prediction. Finally, Section 6 presents the conclusion of this paper. 


[Figure 1 blocks: Neutrosophic AHP as a Feature Selection; Association Rules]

Figure 1. The overall steps of the proposed model to predict heart disease. 


2. Heart Disease Prediction 


Public health systems face substantial difficulties from cardiovascular disease, which remains a 
primary cause of morbidity and death globally. In order to adopt preventative measures, optimize 
treatment options, and reduce the burden of cardiovascular events, early identification and precise 
prediction of those at risk is critical. Predictive models and risk assessment approaches that help in 
the early detection of heart disease susceptibility have been the subject of intensive research and 
development in recent years. The purpose of this study is to present an in-depth analysis of the 
current status of cardiac disease prediction, including its successes, failures, and prospective future 
developments [13, 14]. 

Integration of demographics, medical history, lifestyle choices, and clinical biomarkers allows 
for more accurate prediction of cardiovascular disease. To calculate an individual's risk of 
cardiovascular disease, doctors have traditionally used risk assessment models like the Framingham 
Risk Score. The advent of technology and the availability of massive quantities of healthcare data, 
however, has led to the development of creative methodologies that use machine learning algorithms, 
artificial intelligence, and big data analytics to provide more precise and individual predictions. 

Researchers and medical practitioners encounter a number of obstacles while attempting to 
foresee cases of heart disease. Among these difficulties are: 

The accuracy and quality of the data used in heart disease prediction models are crucial. 
However, the accuracy, consistency, and completeness of data might vary widely depending on the 
source. To maintain the consistency and accuracy of prediction models, it is important to take data 
quality and standardization into account when integrating data from several sources, such as 
electronic health records, wearable devices, and genetic databases [15, 16]. 

While machine learning algorithms are useful for making predictions, they are not always easy 
to understand. Some models have a black box quality that makes it hard to decipher what is really 
driving forecasts. Gaining an understanding of the prediction process, fostering confidence among 
healthcare professionals, and aiding sound decision-making all depend on having access to 
interpretable models. 

Prediction algorithms for cardiovascular disease depend on highly private medical information. 
It is critical that personal information about patients be kept private and that data be kept secure. 
Predicting cardiovascular illness is complicated by the need to protect individual privacy while yet 
providing researchers with access to necessary data [17, 18]. 

Fairness and Bias: Predictive methods may unwittingly amplify existing biases in the training 
data. Predicting cardiac disease may be difficult because of racial, ethnic, socioeconomic, and gender 
biases in healthcare. To guarantee fair and objective forecasts for everyone, it is essential to address 
and mitigate these biases. 

External Validation and Generalizability: Predictive models built for one population or healthcare 
system may not be applicable to another. To evaluate the efficacy and applicability of models, it is 
essential to conduct external validation in a variety of populations. The issue of designing models 
that work well for a wide range of users and contexts persists. 

Dynamic variables that change over time have an impact on heart disease, as shown by 
longitudinal studies. Changes in risk variables, illness progression, and response to therapy are all 
important to account for in predictive models. Predicting the onset of cardiac disease is difficult since 
it requires taking into account both static and dynamic factors. 

Including Genetic Data: Many people's predisposition to developing heart disease is determined 
by their genes. The precision and accuracy of prediction models may be improved by including 
genetic information in their construction. However, there are obstacles such as the difficulty in 
analyzing genetic data, the need for big genetic databases, and the ethical concerns with genetic 
testing and privacy [17, 19]. 

Underrepresented groups, for example people of color, have been included less often in heart disease 
research and data collection. This underrepresentation may impair the development 
of specific risk assessment models for different groups and lead to discrepancies in forecast accuracy. 
Important steps towards a solution include filling up data gaps and ensuring research is inclusive 
[20, 21]. 

Improving the precision, fairness, and practicality of heart disease prediction algorithms 
depends on resolving these issues. We want to overcome these obstacles by creating highly accurate 
prognostic tools for the effective prevention and treatment of cardiovascular disease. 


3. Methodology 


This section has three layers. First, the neutrosophic AHP is used as a feature selection method to 
select the best features in the dataset. Then, we use association rules to find the rules between 
data values. Finally, we apply nine machine learning models to predict heart disease. 


3.1 Neutrosophic AHP as a Feature Selection 

In order to choose which characteristics should be included in a model for predicting heart 
disease, neutrosophic AHP feature selection is used. The goal of neutrosophic feature selection is to 
deal with data uncertainty and imprecision by giving each feature a degree of membership [22]. This 
permits the characteristics most helpful to the model's prediction ability to be chosen, with their 
neutrosophic nature taken into account. By zeroing in on the most relevant characteristics, the 
accuracy and interpretability of heart disease prediction models may be increased by utilizing 
neutrosophic AHP feature selection approaches. We used the neutrosophic AHP method as a feature 
selection [23-25]. 

Each input layer is given due consideration using the AHP technique as a means of producing a 
well-informed choice. When employing the AHP, you may use both quantitative and qualitative data 
because of the hierarchical structure provided by comparing each criterion. The AHP technique 
allows for a rating scale from 1 to 9 for any given set of data. 

When thinking about the first issue, the AHP method works well. This is due to the fact that 
AHP approaches may rank competing criteria in order of preference based on contextual factors. 
Indicators used in selecting choices may also be affected by the structure of the regional forwarding 
network [26]. The optimal size of a collection of cooperative candidates for a relay is the second open 
question. Cooperative candidate relay sets may include groups of nearby nodes with varying data 
redundancy rates, cooperative relay delays, and delivery ratios. One of the sets of a certain size is 
deemed the cooperative candidate relay set after being evaluated based on its characteristics, 
compatibility with the vehicular environment, and good trade-off among the necessary aspects. 


Step 1. Perform the hierarchical analysis of the features in the dataset. 

The hierarchy defines the goal of the problem and the features to be evaluated. 

Step 2. Build the pairwise comparison matrix. 

We used the triangular neutrosophic scale to evaluate the features [27]. In the comparison matrix, a 
refers to the triangular neutrosophic number, n refers to the number of criteria, and t refers to the 
decision makers. 

Step 3. Obtain the crisp value. 

We used the score function to obtain the crisp value [27]. 

Step 4. Combine the opinions of experts. 

We used the average method to combine the different pairwise comparison matrices into one matrix. 

Step 5. Compute the row average. 

Step 6. Normalize the crisp values. 

$$w_i^{m} = \frac{w_i}{\sum_{i=1}^{n} w_i} \quad (3)$$

where $w_i$ is the row average of feature i obtained in Step 5. 

Step 7. Compute the consistency ratio (CR). 

$$CR = \frac{CI}{RI} \quad (4)$$

$$CI = \frac{\lambda_{max} - n}{n - 1} \quad (5)$$

where CI is the consistency index, RI is the random consistency index for a matrix of size n, and 
$\lambda_{max}$ is obtained from the weighted sum vector. 
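
Steps 5 to 7 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: it starts from an already crisp comparison matrix (the score function of Step 3 is not reproduced here), the 3 x 3 matrix values are hypothetical, the function name `ahp_weights` is ours, and the random index values are the commonly tabulated ones from the classical AHP literature rather than values given in this paper.

```python
# Commonly tabulated random consistency index (RI) values for matrix sizes 1..10.
# Assumption: these come from the classical AHP literature, not from this paper.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_weights(matrix):
    """Return (weights, consistency_ratio) for a crisp pairwise comparison matrix."""
    n = len(matrix)
    # Step 5: row averages of the crisp comparison matrix.
    row_avg = [sum(row) / n for row in matrix]
    # Step 6: normalize the row averages so that the weights sum to 1.
    total = sum(row_avg)
    weights = [value / total for value in row_avg]
    if n < 3:
        # Matrices of size 1 or 2 are always consistent.
        return weights, 0.0
    # Step 7: estimate lambda_max from the weighted sum vector (matrix times weights).
    weighted_sum = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(ws / w for ws, w in zip(weighted_sum, weights)) / n
    ci = (lambda_max - n) / (n - 1)
    cr = ci / RANDOM_INDEX[n]
    return weights, cr


# Hypothetical, perfectly consistent comparison of three features.
comparison = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights, cr = ahp_weights(comparison)
print(weights, cr)
```

For this consistent matrix the weights are proportional to 4 : 2 : 1 and the consistency ratio is zero up to rounding; a CR below 0.1 is the threshold usually accepted in AHP practice.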


3.2 Association Rules 

In order to model and uncover the interdependencies between database entries, association rules 
are used. Support, confidence, and lift are criteria to show the importance of associations [28-31]. 
3.2.1 Support 

This metric indicates how often a given itemset appears across all transactions. Suppose set1 is 
{bread} and set2 is {shampoo}. Bread is purchased far more often than shampoo, so the support for 
set1 is greater than that for set2. Now suppose set1 is {bread, butter} and set2 is {bread, shampoo}. 
Bread and butter often appear in the same cart, whereas bread and shampoo rarely do, so set1 again 
has the higher support. In mathematical terms, the support of an itemset is the fraction of all 
transactions that include it. 


$$Support(\{x\} \rightarrow \{y\}) = \frac{\text{transactions containing both } x \text{ and } y}{\text{total number of transactions}} \quad (6)$$

Using the support value, we can determine which rules are worth investigating further. If there are 
10,000 transactions, for instance, it may be useful to focus on the subset of item sets that appears at 
least 50 times, or has support = 0.005. Without further data, we cannot make any firm conclusions 
about the nature of the relationships among the items in a very poorly supported item set. 
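
As a minimal sketch of how support can be computed and used as a filter, the following Python snippet counts the transactions that contain an itemset. The transactions, the candidate itemsets, and the threshold `min_support` are hypothetical values chosen for illustration only.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


# Hypothetical shopping carts.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "shampoo"},
    {"milk"},
    {"bread", "butter"},
]

# Keep only the candidate itemsets that reach the minimum support.
min_support = 0.4
candidates = [{"bread", "butter"}, {"bread", "shampoo"}]
frequent = [c for c in candidates if support(transactions, c) >= min_support]
print(frequent)
```

Here {bread, butter} appears in 3 of 5 carts and is kept, while {bread, shampoo} appears in only 1 of 5 and is discarded.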

3.2.2 Confidence 

This metric describes the probability that the consequent is in the cart given that the antecedent is 
already in the cart. For example, of all the transactions that contain {Captain Crunch}, how many also 
contain {Milk}? Intuitively, the rule {Captain Crunch} → {Milk} should have high confidence. In 
technical terms, confidence is the conditional probability of the consequent given the antecedent. 

$$Confidence(\{x\} \rightarrow \{y\}) = \frac{\text{transactions containing both } x \text{ and } y}{\text{transactions containing } x} \quad (7)$$

Consider a few more examples. What is the confidence of {Butter} → {Bread}, that is, what fraction 
of the transactions containing butter also contain bread? It is very high, close to 1. The same holds 
for {Yogurt} → {Milk}. What about {Toothbrush} → {Milk}? Because milk is such a frequent item, it 
appears in most transactions regardless of the antecedent, so the confidence of this rule is also high. 
High confidence alone therefore does not show that the two items are related. 
3.2.3 Lift 

Lift accounts for the support (frequency) of the consequent when evaluating the conditional 
probability of Y given X. The name is quite literal: it is the lift that X gives to our confidence that Y 
is in the cart. In other words, lift is the ratio of the probability that Y is in the cart when X is known 
to be present to the probability that Y is in the cart when nothing is known about X. 


$$Lift(\{x\} \rightarrow \{y\}) = \frac{(\text{transactions containing both } x \text{ and } y) / (\text{transactions containing } x)}{\text{fraction of transactions containing } y} \quad (8)$$
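
The difference between confidence and lift can be illustrated with a small Python sketch. The ten transactions below are hypothetical and were chosen so that milk is very frequent; the helper names `fraction`, `confidence`, and `lift` are ours.

```python
def fraction(transactions, items):
    """Fraction of transactions that contain all the given items."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, x, y):
    """Confidence of the rule x -> y: P(y in cart | x in cart)."""
    return fraction(transactions, set(x) | set(y)) / fraction(transactions, x)


def lift(transactions, x, y):
    """Lift of the rule x -> y: confidence divided by the support of y."""
    return confidence(transactions, x, y) / fraction(transactions, y)


# Hypothetical carts: milk is in 8 of 10, toothbrush in 5, bread in 4, butter in 3.
transactions = [
    {"milk", "toothbrush"},
    {"milk", "toothbrush"},
    {"milk", "toothbrush", "bread", "butter"},
    {"milk", "toothbrush"},
    {"toothbrush"},
    {"milk", "bread", "butter"},
    {"milk", "bread", "butter"},
    {"milk"},
    {"milk"},
    {"bread"},
]

conf_tm = confidence(transactions, {"toothbrush"}, {"milk"})
lift_tm = lift(transactions, {"toothbrush"}, {"milk"})
conf_bb = confidence(transactions, {"bread"}, {"butter"})
lift_bb = lift(transactions, {"bread"}, {"butter"})
print(conf_tm, lift_tm, conf_bb, lift_bb)
```

In this toy data the rule {toothbrush} → {milk} has a confidence of 0.8 but a lift of 1.0, meaning that knowing about the toothbrush adds nothing, whereas {bread} → {butter} has a lift of 2.5.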
3.3 Machine Learning Algorithms 

Classification is a supervised learning technique in machine learning; it also denotes a predictive 
modelling challenge in which a class label is predicted for an input sample. Specifically, it is a 
mathematical function (f) that maps input variables (X) to target variables (Y), where Y might be a 
label or category. It may be performed on either structured or unstructured data to make predictions 
about the class of provided data items. Predicting heart disease is an example of classification. 

Classification problems with just two possible class labels (for example, true or false) are known as 
"binary classification." In a binary classification task, one class may represent the normal state and 
the other the abnormal state. For example, in a medical test, "cancer not detected" is the normal 
state and "cancer detected" is the abnormal state. In the same way, the "spam" and "not spam" 
categories used by email service providers are also binary classes [23, 33]. 

Many classification methods have been proposed in the machine learning and data science 
literature. The approaches most widely used to predict heart disease are summarized here. 

3.3.1 Naive Bayes 

By using Bayes' theorem under the premise of feature independence, the naive Bayes (NB) 
algorithm is developed. In many practical applications, such as document or text categorization, 
spam filtering, etc., it performs admirably and may be used for both binary and multi-class categories. 
The NB classifier may be used to efficiently categorize the data’s noisy examples and build a solid 
prediction model. The main advantage is that it just requires a minimal amount of training data to 
rapidly and accurately estimate the required parameters, in contrast to more complex methods. 
However, it makes very strong assumptions about the independence of characteristics, which might 
reduce its performance. Common NB classifier versions include the Gaussian, Multinomial, 
Complement, Bernoulli, and Categorical distributions [34, 35]. 
3.3.2 Logistic Regression 

LR is another popular probabilistic-based statistical model used to address classification problems 
in machine learning. A logistic function, often known as the sigmoid function from its mathematical 
definition, is commonly used in LR to assess probabilities. Overfitting is possible with high- 
dimensional data, and it performs best when the data can be linearly partitioned. In these cases, 
regularization (L1 and L2) methods may be employed to prevent over-fitting. The linearity 
assumption between the dependent and independent variables is seen as a fundamental limitation of 
LR. Although it is more typically used for classification problems, it may also be used for 
regression problems [36]. 


$$LR(r) = \frac{1}{1 + \exp(-r)} \quad (9)$$
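
The logistic function of Equation (9) and its use for a binary decision can be sketched as follows. The weights, bias, feature values, and the 0.5 threshold are hypothetical and are not fitted to the paper's data; the two-branch form of `logistic` is a standard way to avoid overflow for large negative inputs.

```python
import math


def logistic(r):
    """Logistic (sigmoid) function 1 / (1 + exp(-r)), written to avoid overflow."""
    if r >= 0:
        return 1.0 / (1.0 + math.exp(-r))
    z = math.exp(r)
    return z / (1.0 + z)


def predict(weights, bias, features, threshold=0.5):
    """Return (probability, class label) for one sample under a linear model."""
    r = bias + sum(w * x for w, x in zip(weights, features))
    probability = logistic(r)
    return probability, int(probability >= threshold)


# Hypothetical coefficients for two standardized features.
probability, label = predict([1.2, -0.7], -0.1, [0.8, 0.3])
print(probability, label)
```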
3.3.3 K-Nearest Neighbors 

Known as a "lazy learning" method, k-nearest neighbors (KNN) is a kind of "instance-based 
learning" or non-generalizing learning. Rather than building a general internal model, it stores all 
instances corresponding to the training data in n-dimensional space. KNN uses similarity measures 
(such as the Euclidean distance function) to classify new data points. Each point is assigned to a 
class by a simple majority vote of its k nearest neighbors. It is quite robust to noisy training data, 
although its accuracy is data-dependent. Choosing the optimal number of neighbors can be 
challenging when using KNN. KNN is versatile, since it may be used for both classification and 
regression [37]. 
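
A minimal sketch of the majority vote described above, using the Euclidean distance, is given below. The two-feature training points and labels are hypothetical, and the function name `knn_predict` is ours.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify query by a majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data: label 0 = no disease, label 1 = disease.
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_points, train_labels, (2, 2)))
print(knn_predict(train_points, train_labels, (6, 5)))
```

An odd value of k avoids tied votes in binary problems.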
3.3.4 Support Vector Machine 

SVMs are another prominent machine learning technique that may be used for classification, 
regression, and other tasks. An SVM builds a hyper-plane or set of hyper-planes in a high- or 
infinite-dimensional space. Intuitively, the hyper-plane with the largest distance from the closest 
training data points of any class achieves a strong separation since, in general, the larger the margin, 
the smaller the classifier's generalization error. It works well in high-dimensional spaces and behaves 
differently depending on the kernel function used. Common kernel functions used in SVM classifiers 
include linear, polynomial, radial basis function (RBF), and sigmoid. SVM performs poorly, however, 
when the dataset contains more noise, such as when the target classes overlap [38, 39]. 
3.3.5 Decision Tree 

The decision tree (DT) is a popular non-parametric supervised learning method. DT learning 
techniques are used for both classification and regression tasks. Popular DT algorithms include ID3, 
C4.5, and CART. In addition, the more recently proposed BehavDT and IntrudTree by Sarker et al. are 
effective in their respective application fields, such as user behavior analytics and cybersecurity 
analytics. A DT classifies an instance by sorting it down the tree from the root node to a leaf node. 
At each node, the attribute specified by that node is tested, and the instance moves down the branch 
corresponding to the attribute's value. The Gini impurity and the entropy used for information gain 
are two of the most commonly used splitting criteria [40]. 
We can define entropy and Gini as: 

$$H(x) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i) \quad (10)$$

$$E = 1 - \sum_{i=1}^{c} p_i^2 \quad (11)$$
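
The two splitting criteria can be computed directly from the class labels that reach a node. The sketch below is illustrative; the label lists are hypothetical and the helper names are ours.

```python
import math
from collections import Counter


def class_probabilities(labels):
    """Relative frequency of each class among the labels at a node."""
    counts = Counter(labels)
    return [count / len(labels) for count in counts.values()]


def entropy(labels):
    """Entropy in bits of the class distribution at a node."""
    return -sum(p * math.log2(p) for p in class_probabilities(labels))


def gini(labels):
    """Gini impurity of the class distribution at a node."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))


mixed_node = [0, 0, 1, 1]   # evenly split node
pure_node = [1, 1, 1, 1]    # node containing a single class
print(entropy(mixed_node), gini(mixed_node))
print(entropy(pure_node), gini(pure_node))
```

An evenly split binary node has the maximum entropy of 1 bit and a Gini impurity of 0.5, while a pure node scores 0 on both measures, which is why splits that produce purer child nodes are preferred.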
3.3.6 Random Forest 

Random forest classifiers are well known in machine learning and data science as an ensemble 
classification approach. The technique uses a "parallel ensemble," which fits several decision tree 
classifiers in parallel on different subsamples of the dataset and determines the final result by 
majority vote or averaging. As a result, it improves prediction accuracy and controls over-fitting. For 
this reason, a random forest (RF) model with multiple decision trees is typically more accurate than a 
model based on a single decision tree. It combines bootstrap aggregation (bagging) with random 
feature selection to construct a series of decision trees with controlled variation. It works well with 
both categorical and continuous data and may be used for classification and regression problems 
[41, 42]. 
3.3.7 AdaBoost 

Adaptive Boosting (AdaBoost) is an iterative ensemble learning procedure that improves weak 
classifiers by learning from their errors. It was developed by Yoav Freund et al. and is also known as 
"meta-learning." AdaBoost employs a "sequential ensemble," in contrast to the random forest's 
parallel ensemble. It combines multiple weak classifiers to produce a strong classifier with high 
accuracy. AdaBoost is called an adaptive classifier because it significantly improves the classifier's 
efficiency; however, it may cause overfitting in some situations. AdaBoost is best used to boost the 
performance of decision trees, its base estimator, on binary classification problems, but it is sensitive 
to noisy data and outliers [43]. 
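
The error feedback can be made concrete with one round of the standard discrete AdaBoost weight update. This is a sketch of the textbook update rule, not of the specific implementation used in this paper; the four samples and the function name `adaboost_round` are hypothetical.

```python
import math


def adaboost_round(weights, is_correct):
    """One discrete AdaBoost round.

    weights: current sample weights (summing to 1).
    is_correct: whether the weak classifier classified each sample correctly.
    Assumes the weighted error is strictly between 0 and 1.
    Returns (alpha, new_weights), where alpha is the vote of the weak classifier.
    """
    error = sum(w for w, ok in zip(weights, is_correct) if not ok)
    alpha = 0.5 * math.log((1.0 - error) / error)
    # Misclassified samples are up-weighted, correct ones are down-weighted.
    updated = [w * math.exp(-alpha if ok else alpha)
               for w, ok in zip(weights, is_correct)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Four equally weighted samples; the weak classifier gets the last one wrong.
alpha, new_weights = adaboost_round([0.25] * 4, [True, True, True, False])
print(alpha, new_weights)
```

After the update, the single misclassified sample carries half of the total weight, so the next weak classifier in the sequence is pushed to focus on it.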
3.3.8 Gradient Boosting 

Like the random forest described above, Gradient Boosting is an ensemble learning method that 
builds a final model from a collection of individual models (usually decision trees). The gradient is 
used to minimize the loss function, similar to how neural networks use gradient descent to optimize 
weights [44]. 
3.3.9 Bagging 

Bagging, or Bootstrap Aggregating, is a meta-algorithm for machine learning ensembles that 
increases the reliability and precision of statistical classification and regression models. The model 
consists of homogeneous weak learners that are trained independently and in parallel, and whose 
results are then combined by averaging. It reduces variance and helps to avoid overfitting. It is 
typically applied to decision tree methods. Bagging is a special case of the model-averaging 
approach [45]. 
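
The bootstrap-and-vote idea can be sketched as follows. To keep the example short, the weak learner is a one-feature nearest-centroid classifier rather than a decision tree; that substitution, the data values, and all function names are assumptions made for illustration only.

```python
import random
from collections import Counter, defaultdict


def fit_centroids(xs, ys):
    """Weak learner (assumed for brevity): mean feature value of each class."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x, y in zip(xs, ys):
        sums[y] += x
        counts[y] += 1
    return {label: sums[label] / counts[label] for label in counts}


def predict_centroids(model, x):
    """Predict the class whose centroid is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))


def bagging_fit(xs, ys, n_estimators=25, seed=0):
    """Train each weak learner on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_centroids([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Combine the weak learners by majority vote."""
    votes = Counter(predict_centroids(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical one-feature data with two well separated classes.
xs = [1.0, 1.5, 2.0, 2.5, 3.0, 10.0, 10.5, 11.0, 11.5, 12.0]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
models = bagging_fit(xs, ys)
print(bagging_predict(models, 2.0), bagging_predict(models, 11.0))
```

A random forest follows the same pattern with decision trees as the weak learners, plus a random subset of features considered at each split.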


4. Results and analysis 


This section summarizes the analysis of heart disease data and the obtained results from the 
various machine learning algorithms. 


4.1 Description of Dataset 

The dataset is publicly available on the Kaggle website. It was collected as part of an ongoing 
cardiovascular study of residents of the town of Framingham, Massachusetts. The dataset contains 
patient information. It consists of nearly 4,000 rows and fifteen attributes. Furthermore, Table 1 
displays summary statistics for the dataset's input parameters, including the count, mean, standard 
deviation, minimum, 25%, 50%, 75%, and maximum values. 


Table 1. The statistics values of the attributes in heart disease data. 


Statistics   age        sex        cp         trestbps   chol       fbs        restecg
count        1025.000   1025.000   1025.000   1025.000   1025.000   1025.000   1025.000
mean         54.434     0.696      0.942      131.612    246.000    0.149      0.530
Std.         9.072      0.460      1.030      17.517     51.593     0.357      0.528
Min          29.000     0.000      0.000      94.000     126.000    0.000      0.000
25%          48.000     0.000      0.000      120.000    211.000    0.000      0.000
50%          56.000     1.000      1.000      130.000    240.000    0.000      1.000
75%          61.000     1.000      2.000      140.000    275.000    0.000      1.000
Max          77.000     1.000      3.000      200.000    564.000    1.000      2.000

Statistics   thalach    exang      oldpeak    slope      ca         thal
count        1025.000   1025.000   1025.000   1025.000   1025.000   1025.000
mean         149.114    0.337      1.072      1.385      0.754      2.324
Std.         23.006     0.473      1.175      0.618      1.031      0.621
Min          71.000     0.000      0.000      0.000      0.000      0.000
25%          132.000    0.000      0.000      1.000      0.000      2.000
50%          152.000    0.000      0.800      1.000      0.000      2.000
75%          166.000    1.000      1.800      2.000      1.000      3.000
Max          202.000    1.000      6.200      2.000      4.000      3.000


Figure 2 shows the distribution of the sex and target columns. Red refers to female and blue refers 
to male. A target value of 0 refers to the class with no disease, and 1 refers to the class with disease. 
In class 0, the number of males is greater than the number of females. In class 1, the number of 
males is also greater than the number of females. 




Figure 2. The sex and target columns. 


Figure 3 shows the scatter diagram of the age and cholesterol columns. Red refers to disease and 
blue refers to no disease. Among persons between 30 and 40 years old, cases with disease outnumber 
cases with no disease. 


Figure 3. The scatter diagram of age and cholesterol. 


El-Douh et al., Heart Disease Prediction under Machine Learning and Association Rules under Neutrosophic Environment 


Neutrosophic Systems with Applications, Vol. 10, 2023 44 
An International Journal on Informatics, Decision Science, Intelligent Systems Applications 


Figure 4 shows the heatmap of the correlations in the dataset. In the age row, six criteria are 
negatively correlated with age and the others are positively correlated. The ca criterion has the 
highest positive correlation with age. Age has a negative correlation with the target variable. For the 
sex criterion, there are 8 negatively correlated criteria and 5 positively correlated criteria. Sex has a 
negative correlation with the target variable, and thal is the variable most highly correlated with sex. 
For cp, six variables are positively correlated and the others are negatively correlated. cp has a 
positive correlation with the target variable and is the variable most correlated with it. For trestbps, 
7 variables are positively correlated and the others are negatively correlated; trestbps has a negative 
correlation with the target variable. For chol, 7 variables are positively correlated and the others are 
negatively correlated; chol has a negative correlation with the target variable. For fbs, 8 variables are 
positively correlated and the others are negatively correlated; fbs has a negative correlation with the 
target variable. For restecg, 4 variables are positively correlated and the others are negatively 
correlated; restecg has a positive correlation with the target variable. For thalach, 4 variables are 
positively correlated and the others are negatively correlated; thalach has a positive correlation with 
the target variable. For exang, 8 variables are positively correlated and the others are negatively 
correlated; exang has a negative correlation with the target variable. For oldpeak, 8 variables are 
positively correlated and the others are negatively correlated; oldpeak has a negative correlation with 
the target variable. For slope, 4 variables are positively correlated and the others are negatively 
correlated; slope has a positive correlation with the target variable. For ca, 8 variables are positively 
correlated and the others are negatively correlated; ca has a negative correlation with the target 
variable. For thal, 7 variables are positively correlated and the others are negatively correlated; thal 
has a negative correlation with the target variable. 

Overall, four variables (cp, restecg, thalach, and slope) are positively correlated with the target 
variable, and all other variables are negatively correlated with it. Among these four, cp has the 
largest positive correlation with the target variable. The variables cp, restecg, thalach, and slope 
therefore have a positive association with the target variable. 
Figure 4. The heatmap in the dataset. 
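
The correlation analysis above can be illustrated with a minimal sketch that computes the Pearson 
correlation of each variable with the target and groups the variables by sign. The records below are 
hypothetical toy values, not the paper's dataset, and the helper name pearson is an assumption made 
for illustration. 

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy records (NOT the paper's data), one list per column.
toy = {
    "cp": [2, 1, 3, 0, 0, 1, 2, 0],
    "oldpeak": [0.0, 0.5, 0.2, 2.0, 1.5, 3.0, 0.1, 1.0],
    "target": [1, 1, 1, 0, 0, 0, 1, 0],
}

target = toy["target"]
corr = {name: pearson(col, target) for name, col in toy.items() if name != "target"}

# Group the variables by the sign of their correlation with the target.
positive = sorted(name for name, r in corr.items() if r > 0)
negative = sorted(name for name, r in corr.items() if r < 0)

for name, r in corr.items():
    print(f"{name}: r = {r:+.3f}")
print("positive:", positive)
print("negative:", negative)
```

Applied to all 13 features, the same grouping would yield the positive and negative sets described in 
the text. 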
4.2 Neutrosophic AHP as a Feature Selection 

We build the pairwise comparison matrix between the 13 features. Its entries are triangular 
neutrosophic numbers, which are shown in Table 2. These numbers are then replaced with crisp 
values [27], the resulting matrix is normalized, and the weights of the features are computed, as 
shown in Figure 5. 
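
The steps above can be sketched as follows for the first three features of Table 2. The score function 
used here, S = (l + m + u) / 8 * (2 + T - I - F), is one commonly used in the neutrosophic AHP 
literature; that it is the exact formula of [27] is an assumption, as are taking reciprocal entries as 
1 / S and averaging the rows of the normalized matrix to obtain the weights. 

```python
def score(l, m, u, t, i, f):
    """Crisp value of a triangular neutrosophic number ((l, m, u); T, I, F).

    Assumed score function: (l + m + u) / 8 * (2 + T - I - F).
    """
    return (l + m + u) / 8.0 * (2.0 + t - i - f)


# Upper-triangle judgments for HDF1..HDF3, copied from Table 2.
upper = {
    (0, 1): (1, 1, 1, 0.50, 0.50, 0.50),
    (0, 2): (6, 7, 8, 0.90, 0.10, 0.10),
    (1, 2): (4, 5, 6, 0.80, 0.15, 0.20),
}


def crisp_matrix(n, upper_entries):
    """Build the crisp comparison matrix; lower entries are reciprocals."""
    matrix = [[1.0] * n for _ in range(n)]
    for (row, col), tnn in upper_entries.items():
        s = score(*tnn)
        matrix[row][col] = s
        matrix[col][row] = 1.0 / s
    return matrix


def ahp_weights(matrix):
    """Normalize each column to sum to 1, then average each row."""
    n = len(matrix)
    col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    normalized = [[matrix[r][c] / col_sums[c] for c in range(n)] for r in range(n)]
    return [sum(row) / n for row in normalized]


matrix = crisp_matrix(3, upper)
weights = ahp_weights(matrix)
for index, w in enumerate(weights, start=1):
    print(f"HDF{index}: weight = {w:.4f}")
```

The weights sum to 1 by construction. This 3 x 3 example only illustrates the procedure and does not 
reproduce the 13-feature weights of Figure 5. 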


Table 2. Comparison matrix between 13 features. 

       HDF1                         HDF2                         HDF3
HDF1   1                            ((1,1,1);0.50,0.50,0.50)     ((6,7,8);0.90,0.10,0.10)
HDF2   1/((1,1,1);0.50,0.50,0.50)   1                            ((4,5,6);0.80,0.15,0.20)
HDF3   1/((6,7,8);0.90,0.10,0.10)   1/((4,5,6);0.80,0.15,0.20)   1
HDF4   1/((1,2,3);0.40,0.65,0.60)   1/((6,7,8);0.90,0.10,0.10)   1/((2,3,4);0.30,0.75,0.70)
HDF5   1/((2,3,4);0.30,0.75,0.70)   1/((9,9,9);0.100,0.00,0.00)  1/((4,5,6);0.80,0.15,0.20)
HDF6   1/((2,3,4);0.30,0.75,0.70)   1/((3,4,5);0.60,0.35,0.40)   1/((1,1,1);0.50,0.50,0.50)
HDF7   1/((1,2,3);0.40,0.65,0.60)   1/((3,4,5);0.60,0.35,0.40)   1/((1,2,3);0.40,0.65,0.60)
HDF8   1/((9,9,9);0.100,0.00,0.00)  1/((9,9,9);0.100,0.00,0.00)  1/((5,6,7);0.70,0.25,0.30)
HDF9   1/((7,8,9);0.85,0.10,0.15)   1/((6,7,8);0.90,0.10,0.10)   1/((7,8,9);0.85,0.10,0.15)
HDF10  1/((3,4,5);0.60,0.35,0.40)   1/((4,5,6);0.80,0.15,0.20)   1/((3,4,5);0.60,0.35,0.40)
HDF11  1/((5,6,7);0.70,0.25,0.30)   1/((2,3,4);0.30,0.75,0.70)   1/((9,9,9);0.100,0.00,0.00)
HDF12  1/((4,5,6);0.80,0.15,0.20)   1/((4,5,6);0.80,0.15,0.20)   1/((1,2,3);0.40,0.65,0.60)
HDF13  1/((4,5,6);0.80,0.15,0.20)   1/((1,1,1);0.50,0.50,0.50)   1/((1,1,1);0.50,0.50,0.50)

       HDF4                         HDF5                         HDF6
HDF1   ((1,2,3);0.40,0.65,0.60)     ((2,3,4);0.30,0.75,0.70)     ((2,3,4);0.30,0.75,0.70)
HDF2   ((6,7,8);0.90,0.10,0.10)     ((9,9,9);0.100,0.00,0.00)    ((3,4,5);0.60,0.35,0.40)
HDF3   ((2,3,4);0.30,0.75,0.70)     ((4,5,6);0.80,0.15,0.20)     ((1,1,1);0.50,0.50,0.50)
HDF4   1                            ((5,6,7);0.70,0.25,0.30)     ((3,4,5);0.60,0.35,0.40)
HDF5   1/((5,6,7);0.70,0.25,0.30)   1                            ((3,4,5);0.60,0.35,0.40)
HDF6   1/((3,4,5);0.60,0.35,0.40)   1/((3,4,5);0.60,0.35,0.40)   1
HDF7   1/((1,2,3);0.40,0.65,0.60)   1/((5,6,7);0.70,0.25,0.30)   1/((7,8,9);0.85,0.10,0.15)
HDF8   1/((1,2,3);0.40,0.65,0.60)   1/((7,8,9);0.85,0.10,0.15)   1/((5,6,7);0.70,0.25,0.30)
HDF9   1/((1,1,1);0.50,0.50,0.50)   1/((6,7,8);0.90,0.10,0.10)   1/((6,7,8);0.90,0.10,0.10)
HDF10  1/((1,2,3);0.40,0.65,0.60)   1/((1,1,1);0.50,0.50,0.50)   1/((4,5,6);0.80,0.15,0.20)
HDF11  1/((5,6,7);0.70,0.25,0.30)   1/((3,4,5);0.60,0.35,0.40)   1/((9,9,9);0.100,0.00,0.00)
HDF12  1/((1,1,1);0.50,0.50,0.50)   1/((7,8,9);0.85,0.10,0.15)   1/((5,6,7);0.70,0.25,0.30)
HDF13  1/((1,2,3);0.40,0.65,0.60)   1/((2,3,4);0.30,0.75,0.70)   1/((3,4,5);0.60,0.35,0.40)

       HDF7                         HDF8                         HDF9
HDF1   ((1,2,3);0.40,0.65,0.60)     ((9,9,9);0.100,0.00,0.00)    ((7,8,9);0.85,0.10,0.15)
HDF2   ((3,4,5);0.60,0.35,0.40)     ((9,9,9);0.100,0.00,0.00)    ((6,7,8);0.90,0.10,0.10)
HDF3   ((1,2,3);0.40,0.65,0.60)     ((5,6,7);0.70,0.25,0.30)     ((7,8,9);0.85,0.10,0.15)
HDF4   ((1,2,3);0.40,0.65,0.60)     ((1,2,3);0.40,0.65,0.60)     ((1,1,1);0.50,0.50,0.50)
HDF5   ((5,6,7);0.70,0.25,0.30)     ((7,8,9);0.85,0.10,0.15)     ((6,7,8);0.90,0.10,0.10)
HDF6   ((7,8,9);0.85,0.10,0.15)     ((5,6,7);0.70,0.25,0.30)     ((6,7,8);0.90,0.10,0.10)
HDF7   1                            ((3,4,5);0.60,0.35,0.40)     ((5,6,7);0.70,0.25,0.30)
HDF8   1/((3,4,5);0.60,0.35,0.40)   1                            ((3,4,5);0.60,0.35,0.40)
HDF9   1/((5,6,7);0.70,0.25,0.30)   1/((3,4,5);0.60,0.35,0.40)   1
HDF10  1/((2,3,4);0.30,0.75,0.70)   1/((6,7,8);0.90,0.10,0.10)   1/((9,9,9);0.100,0.00,0.00)
HDF11  1/((6,7,8);0.90,0.10,0.10)   1/((1,1,1);0.50,0.50,0.50)   1/((1,2,3);0.40,0.65,0.60)
HDF12  1/((3,4,5);0.60,0.35,0.40)   1/((9,9,9);0.100,0.00,0.00)  1/((6,7,8);0.90,0.10,0.10)
HDF13  1/((5,6,7);0.70,0.25,0.30)   1/((7,8,9);0.85,0.10,0.15)   1/((3,4,5);0.60,0.35,0.40)

       HDF10                        HDF11                        HDF12
HDF1   ((3,4,5);0.60,0.35,0.40)     ((5,6,7);0.70,0.25,0.30)     ((4,5,6);0.80,0.15,0.20)
HDF2   ((4,5,6);0.80,0.15,0.20)     ((2,3,4);0.30,0.75,0.70)     ((4,5,6);0.80,0.15,0.20)
HDF3   ((3,4,5);0.60,0.35,0.40)     ((9,9,9);0.100,0.00,0.00)    ((1,2,3);0.40,0.65,0.60)
HDF4   ((1,2,3);0.40,0.65,0.60)     ((5,6,7);0.70,0.25,0.30)     ((1,1,1);0.50,0.50,0.50)
HDF5   ((1,1,1);0.50,0.50,0.50)     ((3,4,5);0.60,0.35,0.40)     ((7,8,9);0.85,0.10,0.15)
HDF6   ((4,5,6);0.80,0.15,0.20)     ((9,9,9);0.100,0.00,0.00)    ((5,6,7);0.70,0.25,0.30)
HDF7   ((2,3,4);0.30,0.75,0.70)     ((6,7,8);0.90,0.10,0.10)     ((3,4,5);0.60,0.35,0.40)
HDF8   ((6,7,8);0.90,0.10,0.10)     ((1,1,1);0.50,0.50,0.50)     ((9,9,9);0.100,0.00,0.00)
HDF9   ((9,9,9);0.100,0.00,0.00)    ((1,2,3);0.40,0.65,0.60)     ((6,7,8);0.90,0.10,0.10)
HDF10  1                            ((1,1,1);0.50,0.50,0.50)     ((4,5,6);0.80,0.15,0.20)
HDF11  1/((1,1,1);0.50,0.50,0.50)   1                            ((2,3,4);0.30,0.75,0.70)
HDF12  1/((4,5,6);0.80,0.15,0.20)   1/((2,3,4);0.30,0.75,0.70)   1
HDF13  1/((3,4,5);0.60,0.35,0.40)   1/((1,1,1);0.50,0.50,0.50)   1/((1,2,3);0.40,0.65,0.60)

       HDF13
HDF1   ((4,5,6);0.80,0.15,0.20)
HDF2   ((1,1,1);0.50,0.50,0.50)
HDF3   ((1,1,1);0.50,0.50,0.50)
HDF4   ((1,2,3);0.40,0.65,0.60)
HDF5   ((2,3,4);0.30,0.75,0.70)
HDF6   ((3,4,5);0.60,0.35,0.40)
HDF7   ((5,6,7);0.70,0.25,0.30)
HDF8   ((7,8,9);0.85,0.10,0.15)
HDF9   ((3,4,5);0.60,0.35,0.40)
HDF10  ((3,4,5);0.60,0.35,0.40)
HDF11  ((1,1,1);0.50,0.50,0.50)
HDF12  ((1,2,3);0.40,0.65,0.60)
HDF13  1
Figure 5. Weights of 13 features. 


4.3 Association Rules 
Table 3 shows the association rules between the target and the other variables. For each rule it 
presents the support, confidence, lift, leverage, and conviction values. 
Table 3. Association rules between the target and the other variables. 


Column name   Target  Antecedent  Consequent  Support  Confidence  Lift    Leverage  Conviction
in dataset    class   support     support
age           0       0.8537      0.9756      0.8293   0.9714      0.9957  -0.0036   0.8537
age           1       0.9756      0.8537      0.8293   0.8500      0.9957  -0.0036   0.9756
sex           0       1.0         1.0         1.0      1.0         1.0     0.0       Inf
sex           1       1.0         1.0         1.0      1.0         1.0     0.0       Inf
cp            0       1.0         1.0         1.0      1.0         1.0     0.0       Inf
cp            1       1.0         1.0         1.0      1.0         1.0     0.0       Inf
trestbps      0       0.7755      0.7959      0.5714   0.7368      0.9258  -0.0458   0.7755
trestbps      1       0.7959      0.7755      0.5714   0.7179      0.9258  -0.0458   0.7959
[illegible]   0       1.0         1.0         1.0      1.0         1.0     0.0       Inf
[illegible]   1       1.0         1.0         1.0      1.0         1.0     0.0       Inf
restecg       0       1.0         1.0         1.0      1.0         1.0     0.0       Inf
restecg       1       1.0         1.0         1.0      1.0         1.0     0.0       Inf
thalach       0       0.7363      0.7802      0.5165   0.7015      0.8991  -0.058    0.7363
thalach       1       0.7802      0.7363      0.5165   0.6620      0.8991  -0.058    0.7802
exang         0       1.0         1.0         1.0      1.0         1.0     0.0       Inf
exang         1       1.0         1.0         1.0      1.0         1.0     0.0       Inf
oldpeak       0       0.650       0.875       0.525    0.8077      0.9231  -0.0437   0.650
oldpeak       1       0.875       0.650       0.525    0.6000      0.9231  -0.0437   0.875
slope         0       1.0         1.0         1.0      1.0         1.0     0.0       Inf
slope         1       1.0         1.0         1.0      1.0         1.0     0.0       Inf
ca            0       1.0         1.0         1.0      1.0         1.0     0.0       Inf
ca            1       1.0         1.0         1.0      1.0         1.0     0.0       Inf
thal          0       1.0         1.0         1.0      1.0         1.0     0.0       Inf
thal          1       1.0         1.0         1.0      1.0         1.0     0.0       Inf
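
The columns of Table 3 are related by the standard definitions of the association rule measures. 
The sketch below derives confidence, lift, leverage, and conviction from the three support values 
and checks them against the first row of the table. The function name rule_metrics is hypothetical, 
and returning infinity for conviction when confidence equals 1 is a convention assumed here to match 
the Inf entries. 

```python
import math


def rule_metrics(antecedent_support, consequent_support, support):
    """Standard measures for an association rule A -> C.

    support is the joint support of A and C together.
    """
    confidence = support / antecedent_support
    lift = confidence / consequent_support
    leverage = support - antecedent_support * consequent_support
    if confidence >= 1.0:
        conviction = math.inf  # the rule is never violated
    else:
        conviction = (1.0 - consequent_support) / (1.0 - confidence)
    return {
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }


# First row of Table 3: supports 0.8537 (antecedent), 0.9756 (consequent), 0.8293 (joint).
first_row = rule_metrics(0.8537, 0.9756, 0.8293)
for name, value in first_row.items():
    print(f"{name}: {value:.4f}")

# A rule whose supports are all 1.0 gives lift 1, leverage 0, and infinite conviction.
trivial = rule_metrics(1.0, 1.0, 1.0)
print(trivial)
```

A lift of exactly 1 and a leverage of 0 mean that the antecedent and consequent are statistically 
independent, so the rows with all supports equal to 1.0 carry no information about the target. 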
4.4 Performance Measurements 

A confusion matrix describes how a classification algorithm performs on a set of test data for 
which the true labels are known. We divide the dataset into a training set with 80% of the data and 
a test set with 20% of the data. The confusion matrices were used to compute the measures stated in 
Table 4. According to Table 4, the random forest and decision tree have the best accuracy, at 100%. 
Figure 6 shows the confusion matrices. 
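
The measures reported in Table 4 follow from the four cells of a binary confusion matrix. The sketch 
below shows the computation. The counts are illustrative assumptions, not read from Figure 6; they 
were chosen so that the output matches the logistic regression values in Table 4. 

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Assumed counts (hypothetical, for illustration only).
metrics = classification_metrics(tp=84, fp=19, fn=13, tn=89)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Precision penalizes false positives and recall penalizes false negatives, which is why a model such 
as the SVM in Table 4 can have a much higher recall than precision. 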


Table 4. The results of machine learning algorithms. 

Model                Accuracy  Precision  Recall  F1-score
Logistic Regression  0.8439    0.8155     0.8660  0.8400
Random Forest        1.0000    1.0000     1.0000  1.0000
KNN                  0.9805    0.9604     1.0000  0.9798
SVM                  0.6780    0.6165     0.8454  0.7130
AdaBoosting          0.8927    0.9121     0.8557  0.8830
Bagging              0.9902    1.0000     0.9794  0.9896
Gradient Boosting    0.9756    0.9894     0.9588  0.9738
NB                   0.8390    0.8333     0.8247  0.8290
Decision Tree        1.0000    1.0000     1.0000  1.0000
Figure 6. The confusion matrices. 
5. Managerial Implications 


Heart disease prediction has many administrative implications for healthcare organizations and 
providers. They include the following: 

Effective resource allocation is possible with the use of heart disease prediction models. 
Healthcare expenses may be reduced and resource utilization improved by targeting the people at 
highest risk with preventative measures such as screenings and treatments. 

Heart disease prediction enables healthcare administrators to create more effective preventative 
care programs. High-risk patients can be targeted for preventative measures such as lifestyle 
changes, medication management, and routine monitoring. Such measures may improve outcomes for 
patients and reduce costs for healthcare systems. 

Workflow and Care Coordination: Prediction models for cardiovascular disease may help with 
both. Managers can pinpoint the patients most at risk and promptly schedule them for the necessary 
preventative measures. Standardizing care pathways and guaranteeing timely interventions in this 
way leads to better patient care and outcomes. 

Patient Engagement and Education: Prediction models for cardiovascular disease may help with 
both goals. Managers may utilize prognostic data to teach patients about their unique risk factors, the 
value of sticking to their treatment regimens, and the advantages of adopting healthier habits. 
Patients’ desire and ability to make educated choices about their own heart disease prevention and 
treatment may both be improved by patient engagement. 

Heart disease prediction models may also serve as performance metrics for tracking the 
effectiveness of preventative measures and the overall quality of treatment. Managers may monitor 
the progress of high-risk people to see whether the interventions they have put in place are having 
the intended effect. With this information, more informed choices can be made about how best to 
treat cardiac disease. 

Insurance firms and other payers may use heart disease prediction algorithms in risk-based 
contracts and insurance policies. Insurers may adjust customers' premiums, levels of coverage, and 
methods of payment to account for each person's unique estimated risk of cardiovascular disease by 
integrating predictive information. This method encourages individualized and economically viable 
medical protection. 

Data generated by heart disease prediction models may be utilized for scientific inquiry and 
technological advancement. Data produced by prediction models may be analyzed by managers and 
researchers together to discover new risk factors, verify current models, and improve predictive 
algorithms. Working together, we can better understand how to anticipate and treat cardiac disease. 

Predicting cardiovascular disease has broad administrative implications, including but not 
limited to resource allocation, planning for preventative care, streamlining operations, increasing 
patient participation, improving quality, supporting risk-based contracting, and facilitating new 
studies. In the context of heart disease prevention and management, predictive models may help 
healthcare administrators make better choices, enhance the quality of treatment provided, and 
improve patient outcomes. 


6. Conclusions 


Predicting heart disease is important for several reasons, including improving patient outcomes, 
making the best use of resources, and enabling individualized treatment. By drawing on several data 
sets to build disease-specific prognostic models, machine learning algorithms have already shown 
their worth in this area. Better heart disease management and prevention are possible because these 
models can stratify risk, detect disease early, and direct treatment accordingly. 

Several administrative considerations arise from applying machine learning to the problem of 
predicting cardiac disease. By focusing on those most at risk and implementing preventative 
measures first, healthcare systems may make better use of their limited resources. Predictive models 
are used in personalized care plans to increase patient involvement and treatment compliance. Care 
coordination and optimization of workflow allow for prompt screenings and treatments for those at 
high risk. Additionally, cardiovascular disease prediction models allow for better performance 
tracking, quality improvement, and new research. 

The use of machine learning algorithms for the prediction of heart disease has enormous 
potential to improve cardiovascular treatment. Risk stratification, individualized care planning, and 
early identification of cardiac disease are all made possible by these models, which make use of 
massive datasets and sophisticated computational approaches. We used the neutrosophic AHP as a 
feature selection method to select the best features, and we applied association rules to extract 
rules between the values in the dataset. Finally, we used nine machine learning algorithms to predict 
heart disease. Our results show that the highest accuracy is achieved by random forest and decision 
tree (100%), followed by bagging, k-nearest neighbors, and gradient boosting (99%, 98%, and 97%, 
respectively), then by AdaBoosting (89%), then by logistic regression and Naive Bayes (84%), and 
finally by the support vector machine (68%). 